Abstract: This paper proposes a new binarization algorithm for ancient manuscripts and historical documents with bleeding noise. The algorithm consists of three primary processes. In the first process, the pixels of a given gray-scale image are classified into three classes: black foreground pixels, white background pixels, and confused pixels. In the second process, each confused pixel is assigned to either the black or the white class. The classified image is cut into rectangles using the vertical and horizontal histograms of the confused pixels. Each rectangle is a sub-image containing a region of the image whose pixels have similar properties. The third process is a voting process in which a threshold value is selected to binarize each sub-image separately. Seven threshold values derived from six different global binarization techniques contribute to the vote. The binarized image is the collection of the binarization results of the sub-images. Four different metrics are used to evaluate the results of the proposed algorithm. Its performance is compared with that of two widely used binarization algorithms, and the proposed algorithm yields a significant improvement in the binarization of ancient manuscripts and historical documents with bleeding noise.
Keywords: binarization, thresholding, hybrid algorithm, global binarization algorithms, image measuring metrics
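
As an illustration only, the three-class labelling and the threshold vote described in the abstract can be sketched in Python as follows. The abstract does not state how the class boundaries are chosen or how the vote is decided, so the two cut-offs `low` and `high` and the use of the median as the voting rule are assumptions made for this sketch, and all function names are hypothetical.

```python
from statistics import median

# Class labels used in this sketch (hypothetical names).
BLACK, WHITE, CONFUSED = 0, 1, 2


def classify_three(gray, low, high):
    """Label each pixel of a gray-scale image (list of rows, values 0-255).

    Assumption: pixels at or below `low` are foreground (BLACK), pixels at
    or above `high` are background (WHITE), everything between is CONFUSED.
    """
    return [
        [BLACK if p <= low else WHITE if p >= high else CONFUSED for p in row]
        for row in gray
    ]


def vote_threshold(candidates):
    """Pick one threshold from several candidate thresholds.

    Assumption: the vote is taken as the median of the candidates; the
    actual voting rule may differ.
    """
    return median(candidates)


def binarize(sub_image, threshold):
    """Binarize one sub-image: 0 for pixels <= threshold, 255 otherwise."""
    return [[0 if p <= threshold else 255 for p in row] for row in sub_image]


if __name__ == "__main__":
    image = [[10, 128, 250], [40, 90, 220]]
    print(classify_three(image, low=60, high=200))
    # Seven made-up candidate thresholds standing in for the global methods.
    t = vote_threshold([100, 110, 120, 90, 130, 105, 115])
    print(binarize(image, t))
```

In the full algorithm the vote and the binarization would be applied to each rectangular sub-image separately, with the seven candidates computed by the global techniques rather than supplied by hand.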